NSF PAR Search | NSF Public Access Repository

Multi-Scale High-Resolution Logarithmic Grapher Module for Efficient Vision GNNs

Munir, Mustafa; Zhang, Alex; Marculescu, Radu (December 2025, PMLR)

Vision graph neural networks (ViGs) have demonstrated promise in vision tasks as a competitive alternative to conventional convolutional neural networks (CNNs) and vision transformers (ViTs); however, common graph construction methods, such as k-nearest neighbor (KNN), can be expensive on larger images. While methods such as Sparse Vision Graph Attention (SVGA) have shown promise, SVGA’s fixed step scale can lead to over-squashing and missing multiple connections to gain the same information that could be gained from a long-range link. Through this observation, we propose a new graph construction method, Logarithmic Scalable Graph Construction (LSGC), to enhance performance by limiting the number of long-range links. To this end, we propose LogViG, a novel hybrid CNN-GNN model that utilizes LSGC. Furthermore, inspired by the successes of multiscale and high-resolution architectures, we introduce and apply a high-resolution branch and fuse features between our high-resolution and low-resolution branches for a multi-scale high-resolution Vision GNN network. Extensive experiments show that LogViG beats existing ViG, CNN, and ViT architectures in terms of accuracy, GMACs, and parameters on image classification and semantic segmentation tasks. Our smallest model, Ti-LogViG, achieves an average top-1 accuracy on ImageNet-1K of 79.9% with a standard deviation of ± 0.2%, 1.7% higher average accuracy than Vision GNN with a 24.3% reduction in parameters and 35.3% reduction in GMACs. Our work shows that leveraging long-range links in graph construction for ViGs through our proposed LSGC can exceed the performance of current state-of-the-art ViGs.
Free, publicly-accessible full text available December 13, 2026
From Data to Design: Leveraging Frequency Statistics for Efficient Neural Network Architectures

Munir, Mustafa; Li, Guihong; Rahman, Mostafijur; Zhang, Alex; Marculescu, Radu (June 2025, IEEE)

This paper delves into the frequency analysis of image datasets and neural networks, particularly Vision Transformers (ViTs) and Convolutional Neural Networks (CNNs), and reveals the alignment property between datasets and network architecture design. Our analysis suggests that the frequency statistics of image datasets and the learning behavior of neural networks are intertwined. Based on this observation, our main contribution consists of a new framework for network optimization that guides the design process by adjusting the network’s depth and width to align the frequency characteristics of untrained models with those of trained models. Our frequency analysis framework can be used to design better neural networks with better performance-model size trade-offs. Our results on ImageNet-1k image classification, CIFAR-100 image classification, and MS-COCO object detection and instance segmentation benchmarks show that our method is broadly applicable and can improve network architecture performance. Our investigation into the alignment between the frequency characteristics of image datasets and network architecture opens up a new direction in model analysis that can be used to design more efficient networks.
Free, publicly-accessible full text available June 9, 2026
VideoGameBench: Can Vision-Language Models complete popular video games?

Zhang, Alex L; Griffiths, Thomas L; Narasimhan, Karthik R; Press, Ofir (May 2025, https://arxiv.org/abs/2505.18134)

Free, publicly-accessible full text available May 30, 2026
Balancing User Control and Perceived Robot Social Agency Through the Design of End-User Robot Programming Interfaces

https://doi.org/10.1109/HRI61500.2025.10974063

Zhang, Alex Wuqi; Queiroz, Rafael; Sebo, Sarah (March 2025, IEEE)

Perceived social agency, the perception of a robot as an autonomous and intelligent social other, is important for fostering meaningful and engaging human-robot interactions. While end-user programming (EUP) enables users to customize robot behavior, enhancing usability and acceptance, it can also potentially undermine the robot's perceived social agency. This study explores the trade-offs between user control over robot behavior and preserving the robot's perceived social agency, and how these factors jointly impact user experience. We conducted a between-subjects study (N = 57) where participants customized the robot's behavior using either a High-Granularity Interface with detailed block-based programming, a Low-Granularity Interface with broader input-form customizations, or no EUP at all. Results show that while both EUP interfaces improved alignment with user preferences, the Low-Granularity Interface better preserved the robot's perceived social agency and led to a more engaging interaction. These findings highlight the need to balance user control with perceived social agency, suggesting that moderate customization without excessive granularity may enhance the overall satisfaction and acceptance of robot products.
Free, publicly-accessible full text available March 4, 2026
Exploring Robot Personality Traits and Their Influence on User Affect and Experience

https://doi.org/10.1109/HRI61500.2025.10973991

Zhang, Alex Wuqi; Kovacs, Clark; De Pablo, Liberto; Zhang, Justin; Bai, Maggie; Jeong, Sooyeon; Sebo, Sarah (March 2025, IEEE)

As human-robot interactions become more social, a robot's personality plays an increasingly vital role in shaping user experience and its overall effectiveness. In this study, we examine the impact of three distinct robot personalities on user experiences during well-being exercises: a Baseline Personality that aligns with user expectations, a High Extraversion Personality, and a High Neuroticism Personality. These personalities were manifested through the robot's dialogue, which was generated using a large language model (LLM) guided by key behavioral characteristics from the Big 5 personality traits. In a between-subjects user study (N = 66), where each participant interacted with one distinct robot personality, we found that both the High Extraversion and High Neuroticism Robot Personalities significantly enhanced participants' emotional states (arousal, control, and valence). The High Extraversion Robot Personality was also rated as the most enjoyable to interact with. Additionally, evidence suggested that participants' personality traits moderated the effectiveness of specific robot personalities in eliciting positive outcomes from well-being exercises. Our findings highlight the potential benefits of designing robot personalities that deviate from users' expectations, thereby enriching human-robot interactions.
Free, publicly-accessible full text available March 4, 2026
SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?

Yang, John; Jimenez, Carlos E; Zhang, Alex L; Lieret, Kilian; Yang, Joyce; Wu, Xindi; Press, Ori; Muennighoff, Niklas; Synnaeve, Gabriel; Narasimhan, Karthik R; et al (January 2025, International Conference on Learning Representations (ICLR))

Free, publicly-accessible full text available January 22, 2026
Building Scalable Video Understanding Benchmarks through Sports

Agarwal, Aniket; Zhang, Alex; Narasimhan, Karthik; Gilitschenski, Igor; Murahari, Vishvak; Kant, Yash (March 2023, https://arxiv.org/abs/2301.06866)

Full Text Available
Abraham model correlations for describing solute transfer processes into diethyl carbonate

https://doi.org/10.1080/00319104.2019.1675159

Dai, Jingyi; Eddula, Shrika; Jiang, Carina; Zhang, Alex; Liu, Kelly; Zhu, Siqi; Wang, Shang; Gupta, Avi; Churchill, Brittani; Garcia, Estefania; et al (January 2021, Physics and Chemistry of Liquids)

Full Text Available
Aging-associated sinus arrest and sick sinus syndrome in adult zebrafish

https://doi.org/10.1371/journal.pone.0232457

Yan, Jianhua; Li, Hongsong; Bu, Haisong; Jiao, Kunli; Zhang, Alex X.; Le, Tai; Cao, Hung; Li, Yigang; Ding, Yonghe; Xu, Xiaolei (May 2020, PLOS ONE)
Barbuti, Andrea (Ed.)
Full Text Available
Determination of Abraham model correlations for describing solute transfer into the methyl butyrate mono-solvent at 298 K

https://doi.org/10.1080/00319104.2019.1660983

Qian, Ellen; Wadawadigi, Anisha; Zha, Olivia; Liu, Kelly; Dai, Jingyi; Eddula, Shrika; Jiang, Carina; Zhang, Alex; Zhu, Siqi; Garcia, Estefania; et al (January 2020, Physics and Chemistry of Liquids)

Full Text Available
